Efficient learning strategy of Chinese characters based on network approach
Based on a network analysis of the hierarchical structural relations among Chinese characters, we develop an efficient learning strategy for Chinese characters. We regard a learning method as more efficient if one learns the same number of useful Chinese characters with less effort or in less time. We construct a node-weighted network of Chinese characters, where character usage frequencies are used as node weights. Using this hierarchical node-weighted network, we propose a new learning method, the distributed node weight (DNW) strategy, which is based on a new measure of node importance that takes into account both the weight of the nodes and the hierarchical structure of the network. Chinese character learning strategies, particularly their learning order, are analyzed as dynamical processes over the network. We compare the efficiency of three theoretical learning methods and two commonly used methods from mainstream Chinese textbooks, one for Chinese elementary school students and the other for students learning Chinese as a second language. We find that the DNW method significantly outperforms the others, implying that the efficiency of current learning methods of major textbooks can be greatly improved.



Introduction. It is widely accepted that learning Chinese is much more difficult than learning western languages, and that the main obstacle is learning to read and write Chinese characters. However, students who have learned a certain number of Chinese characters and have gradually come to understand the intrinsic coherent structure of the relations between them quite often find that it is not that hard to learn Chinese [1]. Unfortunately, such experiences exist only at the individual level. To date, no textbook has systematically exploited these intrinsic coherent structures to form a better learning strategy. Here we explore such relations between Chinese characters systematically and use them to form an efficient learning strategy.

Complex network theory has been found useful in diverse fields, ranging from social systems and economics to genetics, physiology and climate systems [2HH]. An important challenge in studies of complex networks in different disciplines is how network analysis can improve our understanding of the function and structure of complex systems [THS]. Here we address the question of whether and how a network approach can improve the efficiency of Chinese learning.

Unlike the words of western languages such as English, Chinese characters are not alphabetic but rather ideographic and orthographical [10]. A straightforward example is the relation among the Chinese characters '木', '林' and '森', representing tree, woods and forest, respectively. These characters appear as one tree, two trees and three trees. The connection between the composition forms of these characters and their meanings is obvious. Another example is '本' (root), which is also related to the character '木' (tree): a bar near the bottom of a tree refers to the tree root. Such relations among Chinese characters are common, though sometimes they are not easy to recognize intuitively or, even worse, they may have become fuzzy after a few thousand years of evolution of the Chinese characters. However, the overall forms and meanings of Chinese characters are still closely related [11, 12]: usually, combinations of simple Chinese characters are used to form complex characters. Most Chinese users and learners eventually notice such structural relations, although quite often only implicitly, through accumulated knowledge of and intuition about Chinese characters [13]. Making use of such relations explicitly might help turn rote learning into meaningful learning [14], which could improve the efficiency of students' Chinese learning. In the above example of '木', '林' and '森', instead of memorizing all three characters individually by rote, one just needs to memorize one simple character, '木', and then use the logical relation among the three characters to learn the other two.

However, such structural relations among Chinese characters have not yet been fully exploited in practical Chinese teaching and learning. As far as we know, among all mainstream Chinese textbooks, the textbook of Bellassen et al. [1] is the only one that has partially taken the structural information into consideration. However, the use of such relations in their textbook is, at best, at the level of individual characters and focuses on the details of using such relations to teach some characters one by one. With the network analysis tool at hand, we are able to analyze these relations at a system level. The goal of the present manuscript is to perform such a system-level network analysis of Chinese characters and to show that it can be used to significantly improve Chinese learning.

Major aspects of strategies for teaching Chinese include the choice of character set, the teaching order of the chosen characters, and details of how to teach each individual character. Although our investigation is potentially applicable to all three aspects, we focus here only on the teaching order. The learning order of English words is a well-studied and well-established question [15]. However, there are almost no explicit studies of this kind for Chinese characters. In this work, the character set is taken to be the set of the most frequently used characters, with 99% accumulated frequency [16]. To demonstrate our main point, namely how network analysis can improve Chinese learning, we focus here on the issue of Chinese character learning order.

Although some researchers have applied complex network theory to study the Chinese character network [17, 18], they mainly focus on the network's structural properties and/or evolution dynamics, but not on learning strategies. A recent work studied the evolution of relative word usage frequencies and its implications for the coevolution of language and culture [19]. In contrast to these studies, our work considers the whole structural Chinese character network and, more importantly, the value of the network for developing efficient Chinese character learning strategies. We find that our approach, based on both word usage and network analysis, provides a valuable tool for efficient language learning.

Data and methods. Although nearly a hundred thousand Chinese characters have been used throughout history, modern Chinese no longer uses most of them. For an ordinary Chinese reader, knowing 3,000 to 4,000 characters is enough to read modern Chinese smoothly. In this work, we thus focus only on the 3500 most used Chinese characters, extracted from a standard character list provided by the Ministry of Education of China [20]. According to statistics [16], these 3500 characters account for more than 99% of the accumulated usage frequency in the modern Chinese written language.

Most Chinese characters can be decomposed into several simpler sub-characters [11, 12]. For instance, as illustrated in Fig. 1, the character '添' (meaning 'add') is made from '忝' (ashamed) and '水' (water); '忝' can then be decomposed into '天' (head, or sky) and '心' (heart), and '天' can be decomposed into '一' (one) and '大' (a person standing up, or big). The characters '水', '心', '一' and '大' cannot be decomposed any further, as they are all radical hieroglyphic symbols in Chinese. There are general principles about how simple characters form compound characters, the so-called "Liu Shu" (six ways of creating Chinese characters). Ideally, when for example two characters are combined to form another character, the compound character should be connected to its sub-characters either via their meanings or pronunciations. We have illustrated those principles using the characters listed in Fig. 1; see the Supporting Online Material for more details. While certain decompositions are structurally meaningful and intuitive, others are not that obvious, at least with the current Chinese character forms [21]. In this work, we are not concerned with the question of to what extent Chinese character decompositions are reasonable, the so-called Chinese character rationale [11], but rather with the existing structural relations (sometimes called character-formation rationale or configuration rationale) among Chinese characters and how to extract useful information from these relations to learn Chinese. Our decompositions are based primarily on Refs. [11, 12, 21].

FIG. 1. Chinese character decomposition and network construction. The numerical values in the figure represent learning cost, which will be discussed later. (Legend: Learning, Learned, Unlearned.)

Following the general principles shown in the above example and the information in Refs. [11, 12, 21], we decompose all 3500 characters and construct a network by connecting character B to A through a directed link (adjacency matrix element $a_{BA} = 1$; otherwise it is zero) if B is a "direct" component of A. Here, "direct" means that characters are connected hierarchically (see Fig. 1): assuming B is part of A, if C is part of B and thus in principle C is also part of A, we connect only B to A and C to B, but NOT C to A. In constructing this network, we also include some additional characters that are not within the list of the 3500 most used characters but are used as radicals of characters in the list. More technical details can be found in the Supporting Online Material. Decomposing characters and building up links in this way, we obtain a network that is a Directed Acyclic Graph (DAG), which has a giant component of 3687 nodes (see the Supporting Online Material for details on the number of nodes) and 7024 links, plus 15 isolated nodes. Fig. 2 is a skeleton illustration of the full map of the network.

FIG. 2. Full map of the Chinese character network. For a better visual demonstration, we plot here the minimum spanning tree of the whole network, which is shown in blue, while other links are presented in grey as a background. All characters can be seen when the figure is magnified properly.

FIG. 3. Topological properties of the Chinese character network. (a) Hierarchical distribution: number of characters at each level (horizontal axis: Level). The number of characters in each level that have no offspring is shown in brown. (b) Node-offspring distribution: Zipf plot, where characters are ranked according to their number of offspring (horizontal axis: Rank of Nodes). The number of offspring of a character is plotted against the rank of the character.

As a DAG, the Chinese character network is hierarchical. Starting from the bottom in Fig. 1, where nodes have no incoming links, we can assign a number to a character to denote its level: all components of a character should have lower levels than the character itself. Fig. 3a shows the hierarchical distribution of characters in the network. The figure shows that the network has a small set of radical characters (224 nodes at the bottom level, level 1), and nearly 94% of the characters lie at higher levels. Moreover, the network has a broad, heterogeneous offspring degree distribution (a node's offspring degree is defined as its number of outgoing edges). Notice that in Fig. 3b the number of characters with at least one offspring (one being the smallest number on the vertical axis) is close to 1000 (the largest number shown on the horizontal axis). This means that fewer than 1000 of the 3687 characters are involved in forming other characters. The other characters are simply the top ones in their paths, so that no characters are formed based on them. Their distribution over the different levels is also shown in Fig. 3a.
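
As a rough illustration of the level assignment and the offspring degree described above, the following Python sketch works on a toy network. The `PARTS` table (a reading of the Fig. 1 example), the rule "level = 1 + highest level among the parts" and all function names are assumptions made for illustration; the text only requires that every component lie at a lower level than the character it helps form.

```python
from functools import lru_cache

# Toy data (an assumption for illustration): each character maps to its
# direct components, following the add / ashamed / sky example of Fig. 1.
PARTS = {
    "添": ("水", "忝"),   # add = water + ashamed
    "忝": ("天", "心"),   # ashamed = sky + heart
    "天": ("一", "大"),   # sky = one + big
    "沁": ("水", "心"),   # seep = water + heart
    "水": (), "心": (), "一": (), "大": (),
}


@lru_cache(maxsize=None)
def level(char):
    """Level 1 for characters without components, else one above the highest part."""
    parts = PARTS[char]
    return 1 if not parts else 1 + max(level(p) for p in parts)


def offspring_degree(parts):
    """Outgoing links per node: how many characters it is a direct part of."""
    degree = {char: 0 for char in parts}
    for components in parts.values():
        for component in components:
            degree[component] += 1
    return degree


levels = {char: level(char) for char in PARTS}
degree = offspring_degree(PARTS)
print(levels)
print(degree)
```

On the full network, the same two quantities would give the hierarchical distribution and the Zipf plot of offspring numbers, provided the decomposition table is available.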

Learning Strategy. The heterogeneity of the hierarchical structure, reflected in the broad node-offspring distribution of the Chinese character network, suggests that learning Chinese characters in a "bottom-up" order (starting from level 1 characters and gradually climbing along the hierarchical paths) may be an efficient approach. At the level of learning individual characters, Chinese teaching has indeed used this rationale [1, 22]. Other approaches are based on character usage frequencies, i.e. learning first the most used characters, those appearing in the most used words (Ref. [23] provides a critical review of this approach and others).

To assess the efficiency of different approaches, which is here limited to the learning order of Chinese characters, one needs a method to measure learning efficiency. However, measuring learning efficiency is not trivial and, to the best of our knowledge, no such method currently exists. In our approach, we regard a learning strategy as more efficient if it reaches the same learning goal, i.e. a desired number of learned characters or a desired accumulated character usage frequency, with lower learning cost than other strategies.

The question thus becomes how to determine the learning cost. Of all possible factors related to cost, it is reasonable to assume that a character with more sub-characters and more unlearned sub-characters is more difficult to learn. For example, the character ' ffil ', with 5 sub-characters, is obviously more difficult to learn than ' P ', with 2 sub-characters. Conversely, it is easier to learn a character for which all sub-characters have been learned earlier than another character with the same number of sub-characters, all of which are previously unknown to the learner. We thus intuitively define the cost for a student to learn a character as the sum of the number of its sub-characters and the learning cost of its unlearned sub-characters at the student's current stage. The learning cost of the unlearned sub-characters is calculated recursively until characters at the first level are reached or until all sub-characters have been learned previously. Each unlearned character of the first level contributes cost 1, while previously learned characters contribute cost 0. For example, assume that, at a given stage, a student needs to learn the character '添' and already knows the characters in blue in Fig. 1. We demonstrate the cost for the student to learn this character. First, the character '添' has 2 sub-characters ('水' and '忝'), and the student does not know one of them, '忝'. The total cost of learning the character '添' is thus equal to 2 plus the cost of learning '忝', which, calculated using the same principle, is 2 (it has 2 sub-characters, '天' and '心', neither of which is new to the student). The cost for the student is thus 4. If the student has somehow learned the character '忝' before and then needs to learn '添', the cost of acquiring '添' is only 2. Thus, to learn both characters, it is cheaper to first learn '忝' and then '添' (total cost 2 + 2 = 4), rather than the other way around (4 + 2 = 6).
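
The recursive cost definition above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the `PARTS` table is a toy reading of the Fig. 1 example, the set of known characters is chosen to match the worked example, the function names are hypothetical, and a sub-character reached along several paths is simply counted each time it appears.

```python
# Toy data (an assumption for illustration): direct components per character.
PARTS = {
    "添": ("水", "忝"),   # add = water + ashamed
    "忝": ("天", "心"),   # ashamed = sky + heart
    "天": ("一", "大"),   # sky = one + big
    "水": (), "心": (), "一": (), "大": (),
}


def learning_cost(char, known, parts=PARTS):
    """Cost of learning `char` for a student who already knows `known`."""
    if char in known:
        return 0                      # previously learned: no cost
    components = parts[char]
    if not components:
        return 1                      # unlearned first-level character
    # number of sub-characters plus the cost of the unlearned ones
    return len(components) + sum(learning_cost(c, known, parts) for c in components)


def total_cost(order, known=(), parts=PARTS):
    """Accumulated cost of learning the characters in the given order."""
    learned = set(known)
    total = 0
    for char in order:
        total += learning_cost(char, learned, parts)
        learned.add(char)             # only the character itself becomes known
    return total


known = {"水", "天", "心"}             # assumed prior knowledge of the student
print(learning_cost("添", known))       # 4
print(total_cost(["忝", "添"], known))  # 4
print(total_cost(["添", "忝"], known))  # 6
```

Note that learning a compound character does not mark its components as learned here, which is what reproduces the 4 + 2 = 6 total for the unfavourable order.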

If we assume that the learning goal is to learn more characters, independent of their usage frequency, the optimal learning strategy is to follow the node-offspring order (NOO) from many to few, which means learning characters with more offspring first. In this way, an ancestor character is always learned before its offspring characters, since the ancestor has at least one more offspring than the offspring character. From the learning cost definition, we know that with this approach we never waste effort in learning characters twice. No other strategy is thus better than this one. However, in this way we might learn many characters with low usage frequencies, which are less useful. Hence, as shown in Fig. 4b, if our aim is to acquire more accumulated usage frequency, the NOO-based strategy is indeed not a good one. Being able to achieve a high accumulated usage frequency in a relatively short time is not only good for those who cannot spend much time, but will also help students to do extracurricular reading.
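
A node-offspring order can be sketched as follows, assuming that "offspring" counts every character that contains a node directly or indirectly, as the ancestor argument above suggests. The toy `PARTS` table and the function names are hypothetical, and ties are broken arbitrarily by the input order.

```python
# Toy data (an assumption for illustration): direct components per character.
PARTS = {
    "添": ("水", "忝"),
    "忝": ("天", "心"),
    "天": ("一", "大"),
    "沁": ("水", "心"),
    "水": (), "心": (), "一": (), "大": (),
}


def descendant_sets(parts):
    """For every node, the set of characters built on it (directly or not)."""
    children = {char: [] for char in parts}
    for char, components in parts.items():
        for component in components:
            children[component].append(char)

    cache = {}

    def visit(char):
        if char not in cache:
            found = set()
            for child in children[char]:
                found.add(child)
                found |= visit(child)
            cache[char] = found
        return cache[char]

    return {char: visit(char) for char in parts}


def noo_order(parts):
    """Characters sorted by number of offspring, from many to few."""
    counts = {char: len(d) for char, d in descendant_sets(parts).items()}
    return sorted(parts, key=lambda char: -counts[char])


order = noo_order(PARTS)
print(order)
```

Because a component always has strictly more offspring than any character built on it, every component appears earlier in `order` than the characters that contain it.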

Thus, our main objective is to develop a learning strategy that reaches the highest accumulated usage frequency with limited cost. When simply following the character usage frequency order (UFO method) from high to low, one discards topological relations among characters that could help in the learning process and save cost. In UFO one learns characters at higher levels before learning those at lower levels, which is more costly. Thus, the question becomes one of developing a new centrality measure of character importance that considers both topological relations and usage frequencies. Such a measure could help to obtain a learning order better than both NOO and UFO. One additional consideration is to learn first the characters with larger out-degree in the character network, since a large out-degree means the character is involved as a component in many characters. The method proposed in the following in fact takes all three of these aspects into consideration.

Here we develop a centrality measure that we call distributed node weight (DNW), based on both the network structure and the usage frequencies, which are the node weights ($W_j^{(m)}$). Here $j$ represents the node (character) and $m$ its level in the network. The top level is $m = 5$ (no outgoing links), and the bottom level consists of the nodes with no incoming links. To measure the centrality of node $j$ at level $m$, we pick each of its predecessors (denoted as node $i$ at level $m + 1$) and add its weight $\tilde{W}_i^{(m+1)}$ multiplied by $b$ to the weight $W_j^{(m)}$ as follows:

$$\tilde{W}_j^{(m)} = W_j^{(m)} + b \sum_i \tilde{W}_i^{(m+1)} a_{ji}, \qquad (1)$$

where $b \geq 0$ is a parameter and $a_{ji} = 1$ or $0$ is the adjacency matrix element from node $j$ to node $i$ (i.e. whether or not character $j$ is a direct part of character $i$). In the DNW method one learns characters in order according to their centrality, from highest to lowest. Thus, when $b = 0$, the DNW is equivalent to the UFO method. For $b > 0$, the node's offspring play an important role. When $b = 1$ and all $W_j = 1$ (which means ignoring the difference in character usage frequencies), the DNW centrality order becomes the node-offspring order (NOO). In this sense, the NOO is an unweighted version of the DNW. The DNW order can thus be considered a hybrid of the NOO and UFO.

FIG. 4. Learning efficiency comparison for different learning orders: node-offspring order (NOO), usage frequency order (UFO), distributed node weight (DNW) and two common empirical orders (EM1 for Chinese pupils and EM2 for LCSL). (a) Number of characters is set as the learning goal. (b) Accumulated usage frequency is set as the learning goal. $C_{\min}$ is defined as the learning cost of 1775 characters using the NOO method, and it will be used in the discussion of the learning efficiency index.
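
One possible reading of the DNW definition above is recursive: a character's distributed weight is its own usage frequency plus b times the distributed weights of the characters it is a direct part of. The Python sketch below implements that reading on a toy network. The `PARTS` table, the made-up frequencies in `FREQ` and the function names are all assumptions for illustration, not data or code from the paper.

```python
# Toy data (an assumption for illustration): direct components per character.
PARTS = {
    "添": ("水", "忝"),
    "忝": ("天", "心"),
    "天": ("一", "大"),
    "沁": ("水", "心"),
    "水": (), "心": (), "一": (), "大": (),
}

# Hypothetical usage frequencies (made-up numbers, not measured values).
FREQ = {"添": 0.02, "忝": 0.001, "天": 0.30, "沁": 0.005,
        "水": 0.25, "心": 0.28, "一": 0.40, "大": 0.35}


def dnw_centrality(parts, weight, b):
    """Distributed node weight: own weight plus b times that of direct containers."""
    containers = {char: [] for char in parts}
    for char, components in parts.items():
        for component in components:
            containers[component].append(char)

    cache = {}

    def score(char):
        if char not in cache:
            cache[char] = weight[char] + b * sum(score(c) for c in containers[char])
        return cache[char]

    return {char: score(char) for char in parts}


def dnw_order(parts, weight, b):
    """Characters sorted by DNW centrality, from highest to lowest."""
    centrality = dnw_centrality(parts, weight, b)
    return sorted(parts, key=lambda char: -centrality[char])


print(dnw_order(PARTS, FREQ, 0.35))
```

With `b = 0` the centrality reduces to the raw frequencies, so the order is the usage frequency order. With `b = 1` and unit weights, the score on this toy network equals one plus the number of characters built on a node, which reproduces the node-offspring order.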

Using numerical analysis, we find that the optimal $b$ value for the DNW strategy is $b \approx 0.35$, as discussed below. With this optimal parameter $b$, we compare our DNW learning order against the NOO and the UFO in Fig. 4. We find in Fig. 4a that DNW is close to NOO regarding the total number of characters vs. the learning cost. However, in Fig. 4b, the DNW is significantly better than NOO and even better than UFO regarding the total accumulated usage frequency vs. the learning cost. In the left panel, NOO and DNW are much better than UFO, while in the right panel UFO and DNW are much better than NOO. Thus, only the DNW demonstrates high efficiency in both accumulated frequency and total number of characters.

The DNW in the right panel appears to be only slightly better than the UFO, but this is a little misleading. At the same cost, say around 1000, the difference between the two is relatively small in the right panel, but there is a much bigger difference in the left panel. This means that even though the DNW is only slightly better than the UFO in accumulated usage frequency, significantly more characters are learned following the DNW than following the UFO. Such a difference in the number of known characters is sometimes as important as the accumulated usage frequency when estimating whether an individual is literate or not. For beginners, 400 to 500 characters is roughly the first barrier, and many stop there. Using the UFO, this corresponds to a cost of about 2000, while using the DNW it is only around 1000. Thus, it will be much easier for students to overcome this barrier when using DNW than when using UFO.

We next compare the DNW against two commonly used empirical orders: one is from a set of the most used Chinese textbooks for primary schools in China, which contains 2475 different Chinese characters (EM1); the other is from a mainstream Chinese textbook [25] for students Learning Chinese as a Second Language (LCSL), which contains 1775 different Chinese characters (EM2). We sort the two character sets by first appearance in the new-character lists of the two textbooks and plot their learning results in Fig. 4. The figure shows that, compared to our DNW method, the empirical learning orders perform relatively poorly in both the total number of characters and the accumulated usage frequency. This emphasizes the urgent need to improve the efficiency of current methods of learning Chinese characters.

Optimal b. To find the optimal $b$ value, we define an efficiency index for learning strategies. We first take a certain learning cost and denote it as $C_{\min}$, which is here set to be the cost of learning a total of $N_{\min} = 1775$ characters using the NOO order ($C_{\min} = 3351$; see Fig. 4a). We intuitively assume that the sooner a curve reaches $N_{\min}$, the more efficient the learning is. Thus, the larger the area under a curve in Fig. 4a, the more efficient the learning can be regarded. The same consideration holds for the curves in Fig. 4b. We therefore measure the areas underneath the learning efficiency curves (Fig. 4) up to cost $C_{\min}$ and denote them as $S_n$ (area under the curve of number of characters vs. cost, like the ones in Fig. 4a) and $S_f$ (area under the curve of accumulated usage frequency vs. cost, like those in Fig. 4b), respectively. The ratio between the area underneath the curves $S_n$ ($S_f$) and the area of a rectangular region defined by $C_{\min} N_{\min}$ ($C_{\min} F_{\min}$, where $F_{\min}$ is the maximum accumulated frequency of the curves at $C = C_{\min}$) is defined as the learning efficiency index,

$$v_n = \frac{S_n}{C_{\min} N_{\min}}, \qquad v_f = \frac{S_f}{C_{\min} F_{\min}}.$$

The sooner a curve reaches $N_{\min}$ ($F_{\min}$), the larger the area and thus the ratio, and the more efficient the learning order. In this sense, the above ratios serve as indexes of the efficiency of learning orders.

FIG. 5. Efficiency index of hybrid strategies as a function of $b$ (dots). The two horizontal lines are the efficiency of the node-offspring order (blue line) and usage frequency order (green line). (a) Efficiency when using number of characters as the learning goal. (b) Efficiency when using accumulated usage frequency as the learning goal.
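
The area-ratio index described above can be sketched numerically. The sketch below assumes the learning curve is a step function that jumps each time a character is completed, which is one plausible way to evaluate the area; the function name, argument names and the example numbers are hypothetical.

```python
def efficiency_index(costs, gains, c_min, goal):
    """Area under the step curve gain(cost) on [0, c_min], divided by c_min * goal.

    costs -- learning cost of each character, in learning order
    gains -- what each character adds to the goal (1 per character for the
             character-count index, its usage frequency for the frequency index)
    """
    area = 0.0
    spent = 0.0
    reached = 0.0
    for cost, gain in zip(costs, gains):
        next_spent = spent + cost
        if next_spent > c_min:
            break                                  # budget exhausted
        area += reached * (next_spent - spent)     # flat segment before the jump
        spent = next_spent
        reached += gain
    area += reached * (c_min - spent)              # flat tail up to c_min
    return area / (c_min * goal)


# Same three characters, cheap ones first versus the expensive one first.
print(efficiency_index([1, 1, 2], [1, 1, 1], c_min=4, goal=3))
print(efficiency_index([2, 1, 1], [1, 1, 1], c_min=4, goal=3))
```

In this toy comparison the order that reaches the goal sooner yields the larger ratio, as the argument in the text requires. Scanning such an index over a grid of b values would be one way to locate the optimum.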

In Fig. 5 we plot $v_n$ and $v_f$ of the hybrid strategy (DNW) as functions of $b$. We also plot two lines, for comparison, showing the learning efficiency of the NOO (blue line) and UFO (green line). As $b$ increases, $v_n$ of the hybrid strategy approaches that of the NOO. On the other hand, when $b = 0.35$, $v_f$ of the hybrid strategy reaches its maximum. Thus, with respect to usage frequency, the DNW with $b = 0.35$ is the most efficient. However, if we also consider the number of characters, values in the range $b \in [0.35, 0.7]$ can be regarded as very good choices. As an example, in this work we use $b = 0.35$, which shows a significant improvement over commonly used methods (Fig. 4).

In order to compare the DNW strategy against the others in more detail, we have analyzed the learning cost statistics of the characters covered by cost $C_{\min}$ for all five learning strategies in Fig. 6. Recall that $C_{\min}$ is the cost of learning the first 1775 characters using the NOO, and that the number of characters covered by this $C_{\min}$ differs between methods. Using the measure of learning cost proposed earlier, we record the learning cost of every character before the accumulated cost reaches $C_{\min}$ in each learning order and then plot a histogram of the learning costs of all those characters for each learning order. From Fig. 6a, we see that in both the DNW and NOO learning orders, characters with learning cost 2 are dominant (roughly 80%). In these two learning orders, few characters have a learning cost higher than 3. The other three learning orders have a much smaller fraction of cost-2 characters and more characters with cost higher than 3. Most Chinese characters can be decomposed into 2 direct parts; therefore, learning cost 2 means that when a character is learned, its parts have usually been learned before. This is natural in the NOO order since it is designed that way. However, as seen here, it also holds in the DNW order, which is a major advantage of the DNW order. In Fig. 6b we also plot the corresponding usage frequencies of the set of characters with the same learning cost. With DNW one in fact learns about 6% fewer characters than with NOO, but the usage frequency of the characters learned with DNW is more than 30% higher. Thus DNW is significantly better than NOO. We also find that although DNW and UFO have comparable overall usage frequencies, the DNW is concentrated on the cost-1 and cost-2 characters, while the UFO is distributed widely over characters with learning costs from 1 to 4. This illustrates further why our DNW is an efficient learning order in the sense of both the total number and the total usage frequency of characters.

FIG. 6. Up to a fixed total learning cost $C_{\min}$, for all five learning orders, we count and plot the number of characters according to their individual learning costs in (a) and convert the number of characters into the corresponding usage frequency in (b).
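
The per-character cost statistics discussed above could be gathered along the following lines. This is a sketch under assumptions: the toy `PARTS` table, the two learning orders, the budget values and the function names are hypothetical, and learning stops at the first character that would push the accumulated cost past the budget.

```python
from collections import Counter

# Toy data (an assumption for illustration): direct components per character.
PARTS = {
    "添": ("水", "忝"),
    "忝": ("天", "心"),
    "天": ("一", "大"),
    "沁": ("水", "心"),
    "水": (), "心": (), "一": (), "大": (),
}


def learning_cost(char, known, parts):
    """Recursive learning cost as defined in the text."""
    if char in known:
        return 0
    components = parts[char]
    if not components:
        return 1
    return len(components) + sum(learning_cost(c, known, parts) for c in components)


def cost_histogram(order, parts, budget):
    """Count characters by individual learning cost until the budget is used up."""
    known, spent, histogram = set(), 0, Counter()
    for char in order:
        cost = learning_cost(char, known, parts)
        if spent + cost > budget:
            break
        spent += cost
        known.add(char)
        histogram[cost] += 1
    return histogram


bottom_up = ["一", "大", "天", "心", "忝", "水", "添", "沁"]
top_down = ["添", "忝", "天", "一", "大", "水", "心", "沁"]
print(cost_histogram(bottom_up, PARTS, budget=100))
print(cost_histogram(top_down, PARTS, budget=12))
```

In the bottom-up order every compound character costs 2 because its parts are already known, whereas the top-down order spends most of a small budget on a single expensive character.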

Conclusion and Discussion. We demonstrate the potential of a network approach to significantly increase the efficiency of learning Chinese. By including character usage frequencies as node weights in the structural character network, we discover and develop an efficient learning strategy that makes it possible to turn rote learning of Chinese characters into meaningful learning. In the Supporting Online Material, we present the constructed network in adjacency list form; we also list the Chinese characters ordered according to our DNW centrality. The constructed network might also help design a customized Chinese character learning order for students who have previously learned some Chinese and want to continue their studies at their own pace. Given the information about a student's known characters in our network, our DNW centrality measure can be adapted to find an optimal learning order tailored to that specific student. This goal is completely out of reach of standard textbook-based education, and it will be especially useful for Chinese learners who do not study Chinese in a formal Chinese school, who study Chinese only every now and then, or who use private tutors. We hope that our study will lead to the development of textbooks that apply the DNW learning order and a detailed decomposition of each character. It will also be valuable for Chinese learners to have a dictionary explaining every character and word simply from a core set of a small number of basic characters. Note that we are not claiming that our decomposition is perfect or that our character choice is good enough. These questions are still debated in the field of Chinese character structure. There are possibly also other topological quantities that might be valuable for Chinese learning. Considering our node-weighted network, the concept of a shortest path that accumulates the largest node weight in the fewest steps clearly differs from the usual shortest path. How these quantities are related to Chinese learning is an interesting question that we have not discussed in this work.

Writers, reporters and citizens in China have argued that the Chinese textbooks currently used in mainland China are going in the wrong direction, and that textbooks used 70 years ago seem to be more reasonable. Influenced by English teaching, Chinese teaching has indeed become increasingly speaking- and listening-oriented [23]. A speaking- and listening-oriented approach is a reasonable way to learn a phonetic language. However, for Chinese, an ideographic language, it results in an inefficient learning order of Chinese characters, where structurally complicated characters are often taught before simpler ones. What we are suggesting is that in designing speaking, listening and reading materials, one should utilize the logographic relations among Chinese characters and also respect the optimal learning order discovered from analyzing the character network of these relations. Only by using a network analysis can we capture an entire picture of a network of these structural relations.

SUPPORTING ONLINE MATERIAL
Data and methods 

Decomposition of Chinese characters 

According to "Liu Shu" (six ways of creating Chinese characters), ideally, when sub-characters are combined to form a character, the compound character should be connected to its sub-characters either via their meanings or pronunciations. Thus, Chinese characters are usually meaningfully and coherently connected to each other. Let us start from the bottom of Fig. 1 in the main text. The four characters are "一" (one), "大" (person, big), "心" (heart) and "水" (water). These characters closely resemble the shapes or characteristics of the objects to which they refer, though their forms today might not hold as much of a resemblance as their ancient forms. One can compare the modern simplified Chinese characters against their ancient Zhuanti forms in the figures. Such characters are called pictographic (Xiangxing) characters.

Initially, the character "天" (sky) referred to the head, the primary part of a person, and was formed by placing a bar over the character "大" (person, big). The meaning later developed and became the sky, heaven and god, i.e. the primary part of everything, as ancient Chinese people believed. This way of forming new characters from radical parts is called "simple" ideogram (Zhishi) or "combination character" ideogram (Huiyi). These two mechanisms are in fact slightly different in that the first is based on only one radical part, usually with only a very simple additional stroke, while the second usually involves two radical parts. For a character formed by these two principles, its meaning can usually be read out intuitively from the combination. For example, the character '森' (forest) mentioned in the introduction of the main text follows the principle of "combined" ideogram: it is a stack of three '木' (tree). However, in this work, we will not distinguish the two mechanisms.

The character '忝' (ashamed) is a compound character of '天' and '心'. It follows a different principle, which later became popular in forming new Chinese characters, the so-called pictophonetic formation (Xingsheng). Here, '忝' and '天' have exactly the same pronunciation, and the meaning of '忝' refers to a psychological phenomenon, which was believed to be related to '心' (heart). The same pictophonetic relation holds among '添' (add), '忝' and '水' (water): the first two share the pronunciation, while the last part, '水', is remotely connected to the meaning of '添'. In Fig. 1 of the main text, we also notice that the characters '水' and '心' also form the character '沁' (seep). The character '沁' also follows the pictophonetic formation. It is quite common that some basic characters are used in quite a few composed characters.

Here we have demonstrated four of the six principles.
The other two are phonetic loan (Jiajie) and derivative
cognates (Zhuanzhu). Those two principles concern the
usage of characters rather than the creation of new
characters, and discussing the various usages of Chinese
characters is not the focus of this work. Following the
above general principles, our decompositions of characters
are based primarily on Refs. [11, 12, 21]. The first is a
standard reference in Chinese etymology studies, where the
six principles were first explicitly discussed, and the
last two are regarded as developments of the first, mainly
due to discoveries of new materials, including Oracle
characters (Jiaguwen) and Bronze characters (Jinwen).

Starting from 3500 characters, our network ends up
with a giant component of 3687 nodes and 7025 links,
plus 15 isolated nodes. Why do we have more nodes
than the total number of characters we start with? In
our decomposition, we find some sub-characters beyond
the set of the 3500 most used characters. Sometimes,
such sub-characters are just variations of their normal
forms. The situation becomes more complicated when the
normal form corresponding to a radical is itself not
within the most-used set. In such cases, we add these
"never-independent characters" as extra nodes in the network.
For example, ' ^ ' is such a rarely used character, but we 
keep it in our network. 

See Fig. 2 in the main text for the full map of struc- 
tural relations among Chinese characters. 

Additional explanation of the definition of learning cost

We define the learning cost of a character for a student
to be the sum of the number of its sub-characters and
the learning cost (calculated recursively) of those
sub-characters that are unlearned at the student's current
stage. The recursive definition seems to imply that when
a student is learning a compound character, he has to
recognize the sub-characters first. However, this dynamic
process is only a fictitious process used to represent the
difficulty that the student faces in learning the
character. It does not mean that the learning process
actually proceeds in this way. Recall from the main text
that the total cost of learning '添' before '忝' is
4 = 2 + 2: '添' has 2 sub-characters, and the cost of
learning the unknown '忝' is 2. Therefore, determining the
cost of learning '添' first obviously involves the cost of
learning '忝'. However, this does not imply that the
student knows '忝' after acquiring '添'. If the student
must then learn '忝' next, the learning cost of '忝' is
still 2 even though he has learned '添' before. Thus the
total learning cost of the two characters following the
order '添' → '忝' is 6.
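The recursive definition can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the code used in this work: the table `SUBCHARS` and the function names are hypothetical, the two characters of the example are read as '添' and '忝' with the basic characters '天', '心' and '水' taken as already learned, and the unit cost assigned to a character without sub-characters is an assumption that the example never exercises.

```python
# Hypothetical decomposition table: character -> its direct sub-characters.
# Only the characters of the worked example are included.
SUBCHARS = {
    "添": ["水", "忝"],
    "忝": ["天", "心"],
}


def learning_cost(char, known):
    """Cost of learning `char` when the characters in `known` are already learned."""
    parts = SUBCHARS.get(char, [])
    if not parts:
        # Assumption: a character with no sub-characters costs one unit.
        return 1
    # Number of sub-characters plus the recursive cost of the unlearned ones.
    cost = len(parts)
    for part in parts:
        if part not in known:
            cost += learning_cost(part, known)
    return cost


def total_cost(order, known):
    """Total cost of learning the characters in `order`, one after another."""
    known = set(known)
    total = 0
    for char in order:
        total += learning_cost(char, known)
        # Only the target character becomes known, not the unlearned
        # sub-characters met along the way (the fictitious process above).
        known.add(char)
    return total


basics = {"天", "心", "水"}
print(total_cost(["添", "忝"], basics))  # compound first: 4 + 2
print(total_cost(["忝", "添"], basics))  # sub-character first: 2 + 2
```

Learning the compound first gives 4 + 2 = 6, while learning the sub-character first gives 2 + 2 = 4, in agreement with the totals quoted in the text.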

Of course, the student may learn the character '添'
meaningfully, i.e. when he learns '添' he also learns the
relation between '添' and '忝' (and the meaning of '忝')
explicitly from his books or his instructors. In that case
the total cost for him to learn both characters is in fact
4 (no additional cost for learning '忝'), which is the same
as the cost of learning both characters in the order
'忝' → '添'. Learning closely connected characters together
and learning them meaningfully would therefore reduce the
cost, and one might conclude that our definition of
learning cost does not apply to such meaningful learning.
We would argue, however, that such meaningful learning
implicitly uses the optimal learning order: learning the
two characters simultaneously and meaningfully is
equivalent to learning them in the proper order.

Another problem related to our definition of learning 
cost is that we treated the number of sub-characters and 
the cost of unlearned sub-characters equally. This can 
be questioned and should be investigated further. For 
example, one might introduce a parameter to rescale the 
number of sub-characters and then sum the two together. 
For simplicity, we have not yet discussed this issue. Find- 
ing the proper value of such parameters from empirical 
studies and then comparing performance of those learn- 
ing orders again using the new definition of cost should 
be an interesting topic. 



Supplemental Results 

Finally, we provide two important lists of characters
as the final results of our network-based analysis of
Chinese characters. The first is the adjacency list of the
network of characters. The first character of every line is
the starting point of links and all other characters in the
same line are the ending points of those links, meaning
that the first character is a part of every one of the
other characters. The second is the order of Chinese
characters ranked by the calculated DNW centrality. This
list includes all 3500 characters, and b = 0.5 is used in
the calculation of DNW. In the main text, where 1775
characters are used as the learning target, we find that
the optimal value of the parameter b is b = 0.35. Repeating
the same analysis for all 3500 characters, we find that
learning efficiency is higher when b = 0.5 is used instead
of b = 0.35. The list given here is therefore produced with
the whole set of most used characters as the learning goal.
The lists can be downloaded from our website on Chinese
learning, which is still under development:
http://www.learnm.org/data/.
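As an illustration of the adjacency-list convention described above, the following minimal sketch turns such lines into a map from each compound character to its direct sub-characters. The whitespace-separated layout, the function name `parse_adjacency` and the sample lines are assumptions made for this example (the sample is built from the Fig. 1 characters, read here as '天', '心', '水', '忝', '添' and '沁'); the actual format of the downloadable file may differ.

```python
from collections import defaultdict


def parse_adjacency(lines):
    """Map each compound character to the list of its direct sub-characters.

    Assumed format: characters separated by whitespace, where the first
    character of a line is a part of every other character on that line.
    """
    parts_of = defaultdict(list)
    for line in lines:
        chars = line.split()
        if len(chars) < 2:
            continue  # skip blank lines and characters with no outgoing link
        source, targets = chars[0], chars[1:]
        for target in targets:
            parts_of[target].append(source)
    return dict(parts_of)


# Hypothetical sample lines, not taken from the published list.
sample = [
    "天 忝",
    "心 忝 沁",
    "水 添 沁",
    "忝 添",
]

for compound, parts in parse_adjacency(sample).items():
    print(compound, parts)
```

Inverting the list in this way gives the decomposition of each character directly, which is the form needed to evaluate the recursive learning cost.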



